Dependency graph #659

polinabinder1 · 2025-01-24T22:41:03Z

This PR generates a dependency graph between the bionemo sub-packages. It also parses the source files to check that the imports between sub-packages agree with the dependencies declared in the pyproject.toml files.
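As a rough illustration of the graph-building step, here is a minimal sketch that turns parsed pyproject.toml tables into a package-to-dependencies mapping. It is not the code from scripts/dependency_graph.py: the helper names (bionemo_deps, build_graph, load_pyprojects) are hypothetical, and it assumes each sub-package lists its bionemo-* dependencies under [project].dependencies.

```python
import re
from pathlib import Path

# Matches the leading distribution name of a requirement string such as
# "bionemo-core>=2.0" or "bionemo-llm[extra]".
_NAME_RE = re.compile(r"^\s*([A-Za-z0-9][A-Za-z0-9._-]*)")


def bionemo_deps(requirements):
    """Return the bionemo-* distribution names found in a list of requirement strings."""
    names = set()
    for req in requirements:
        match = _NAME_RE.match(req)
        if match and match.group(1).lower().startswith("bionemo-"):
            names.add(match.group(1).lower())
    return names


def build_graph(pyproject_tables):
    """Map each package name to the bionemo-* packages it declares as dependencies.

    `pyproject_tables` is an iterable of already-parsed pyproject.toml dicts
    (an assumption of this sketch, so the logic can be exercised without files).
    """
    graph = {}
    for table in pyproject_tables:
        project = table.get("project", {})
        name = project.get("name")
        if name:
            graph[name.lower()] = bionemo_deps(project.get("dependencies", []))
    return graph


def load_pyprojects(root):
    """Parse every pyproject.toml under `root` (tomllib requires Python 3.11+)."""
    import tomllib

    tables = []
    for path in sorted(Path(root).rglob("pyproject.toml")):
        with path.open("rb") as handle:
            tables.append(tomllib.load(handle))
    return tables
```

Calling build_graph(load_pyprojects(".")) from the repository root would give a dict that a plotting library could then draw; the package names used in any example input are illustrative only.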

Signed-off-by: Polina Binder <[email protected]>

codecov-commenter · 2025-01-25T00:39:30Z

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 86.75%. Comparing base (60a6dad) to head (d4c7b08).

✅ All tests successful. No failed tests found.

Additional details and impacted files

@@           Coverage Diff           @@
##             main     #659   +/-   ##
=======================================
  Coverage   86.75%   86.75%           
=======================================
  Files         118      118           
  Lines        7059     7059           
=======================================
  Hits         6124     6124           
  Misses        935      935

pstjohn · 2025-01-25T15:08:14Z

scripts/dependency_graph.py

+    return pyproject_files
+
+
+def parse_dependencies(pyproject_path):


Do you think we could also parse tach.toml and possibly warn if the dependency graphs are different? The tach check actually enforces this separation during CI, so it's probably more accurate.

Is there a check in place to ensure that the tach.toml and pyproject.toml files are up-to-date and valid in terms of dependencies? For example, does PyPI installation and importing all subpackages automatically verify this?

We could implement regular checks or enforcement in the CI pipeline. If the above method isn't sufficient, we can create a script that parses the pyproject.toml files of the main project and its subpackages, extracts the import paths used in Python scripts under the src directory of each subpackage, and verifies all imports starting with "from bionemo".
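The import-extraction step proposed above could be sketched with the standard-library ast module. This is only an assumption about how such a check might look, not the script from this PR; bionemo_imports and undeclared_imports are hypothetical helpers, and the sketch treats the first two dotted components (for example bionemo.core) as the sub-package name.

```python
import ast


def bionemo_imports(source):
    """Return the bionemo sub-packages (e.g. "bionemo.core") imported by `source`.

    Relative imports and a bare `from bionemo import x` are ignored, because the
    sub-package cannot be read from the module path alone in those cases.
    """
    found = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            modules = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom) and node.level == 0 and node.module:
            modules = [node.module]
        else:
            continue
        for module in modules:
            parts = module.split(".")
            if parts[0] == "bionemo" and len(parts) >= 2:
                found.add(".".join(parts[:2]))
    return found


def undeclared_imports(source, own_package, declared):
    """Return imported bionemo sub-packages that are neither `own_package` nor declared."""
    return bionemo_imports(source) - set(declared) - {own_package}
```

A CI check could run undeclared_imports over every .py file under a sub-package's src directory and fail if the result is non-empty.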

Dorota -- I suggest we constrain scope for now to just drawing the dependency graph.

I agree it's a good idea to do what you're describing, but it would increase the scope of this substantially and at the moment there's other stuff we gotta do :)

@trvachov, @polinabinder1, in that case it would be good to file this as a GitHub issue and a JIRA ticket, and to add a warning note that this method does not ensure the correctness of the dependency graph and that the method proposed in my comment, or an alternative tool, is still needed to complete this task.

The new code gets the tach.toml dependencies and checks that the code imports in the sub-packages are correct based on what is in the pyproject.toml and tach.toml files.
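To illustrate the kind of cross-check described in this thread, here is a hedged sketch that compares two already-built dependency graphs and reports where they disagree. The function names are hypothetical, and the name normalization (treating bionemo-core and bionemo.core as the same package) is an assumption about how the two files spell package names.

```python
def normalize(name):
    """Spell a package name the same way regardless of source file (assumed convention)."""
    return name.lower().replace("-", ".").replace("_", ".")


def _normalized(graph):
    return {normalize(pkg): {normalize(dep) for dep in deps} for pkg, deps in graph.items()}


def graph_differences(pyproject_graph, tach_graph):
    """Return {package: (only_in_pyproject, only_in_tach)} for packages whose edges differ."""
    declared_graph = _normalized(pyproject_graph)
    enforced_graph = _normalized(tach_graph)
    diffs = {}
    for package in sorted(set(declared_graph) | set(enforced_graph)):
        declared = declared_graph.get(package, set())
        enforced = enforced_graph.get(package, set())
        if declared != enforced:
            diffs[package] = (declared - enforced, enforced - declared)
    return diffs


def format_warnings(diffs):
    """Turn the differences into human-readable warning lines."""
    lines = []
    for package, (only_pyproject, only_tach) in diffs.items():
        if only_pyproject:
            lines.append(f"{package}: in pyproject.toml but not tach.toml: {sorted(only_pyproject)}")
        if only_tach:
            lines.append(f"{package}: in tach.toml but not pyproject.toml: {sorted(only_tach)}")
    return lines
```

Returning the differences as data and formatting them separately keeps the comparison easy to unit test and leaves the choice between warning and failing to the caller.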

dorotat-nv

I am not sure if pyproject.toml is the up-to-date source of dependency information anymore. I am not sure how it is maintained.

If not, I think we should implement a script that derives the dependency graph for the subpackages from the source files, i.e. parse the pyproject.toml files of the main project and its subpackages, extract the import paths used in Python scripts under the src directory of each subpackage, and verify all imports starting with "from bionemo".

trvachov · 2025-01-27T23:34:23Z

To me this looks like a good start; we just need to document how to run the script in the GitHub PR description. I don't necessarily need this to do any more than it currently does (@pstjohn @dorotat-nv, I suggest we constrain our review to just "graph drawing" rather than any sort of py file parsing + CI enforcement).

dorotat-nv

Could this script be relocated under internal/scripts?

Signed-off-by: Polina Binder <[email protected]>

polinabinder1 · 2025-01-28T22:04:52Z

Could this script be relocated under internal/scripts?

Done!

pstjohn

This looks great, the module was laid out nicely and you have a lot of clear, reusable functions.

I do think we need unit tests for these functions though. Copilot could likely do that pretty quickly. Ideally any public function / class / method should get a test for all its expected behavior and edge cases, but even just a basic test for each of these functions would be great.

I also think we should add the resulting images to our documentation, along with the command to run to regenerate them.

Unfortunately putting this in internal/scripts means it's harder to write tests for; we don't execute pytests in that subdirectory. Maybe this could live in bionemo-fw? Or we could add that directory to our pytest call. But I'm worried that if we don't exercise this script in CI, it will break quickly without us realizing it and we won't be able to use it in planning a version bump / release strategy.

dependecy graph

9ed7b1e

Signed-off-by: Polina Binder <[email protected]>

pstjohn reviewed Jan 25, 2025

dorotat-nv reviewed Jan 27, 2025

Merge branch 'main' into polinabinder/package_dependencies

3e5066b

dorotat-nv reviewed Jan 28, 2025

polinabinder1 force-pushed the polinabinder/package_dependencies branch 2 times, most recently from ccab862 to 94fd399 · January 28, 2025 22:00

more graphing of package dependencies

ad206c4

Signed-off-by: Polina Binder <[email protected]>

polinabinder1 force-pushed the polinabinder/package_dependencies branch from 94fd399 to ad206c4 · January 28, 2025 22:01

moving file

d4c7b08

Signed-off-by: Polina Binder <[email protected]>

polinabinder1 marked this pull request as ready for review January 28, 2025 22:04

polinabinder1 requested review from jstjohn, malcolmgreaves, sichu2023, skothenhill-nv, jomitchellnv, jwilber and cspades as code owners January 28, 2025 22:04

polinabinder1 requested review from pstjohn, dorotat-nv and trvachov January 28, 2025 22:05

pstjohn reviewed Jan 29, 2025

dorotat-nv approved these changes Jan 29, 2025
